This article proposes the DataBook, a design pattern that uses Markdown to bridge the gap between large-scale RDF knowledge graphs and small, ephemeral, task-specific semantic content. By combining YAML frontmatter for metadata, inline identifiers for addressability, and typed fenced code blocks for data payloads, DataBooks create self-describing and portable semantic artifacts. The authors argue that this approach enables a microdatabase model in which structured data can exist without the overhead of a full triple store.
Key points include:
The use of Markdown as a substrate for semantic infrastructure.
Defining the microdatabase for small-scale, non-indexed knowledge work.
Inverting the LLM role to act as a transformation engine within a DataBook pipeline.
Implementing provenance through process stamps in YAML metadata.
Managing complex dependencies via manifest DataBooks and build graphs.
Supporting secure data transfer through designed-in encryption profiles.
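To make the structure described above concrete, here is a minimal sketch of how a DataBook-style document might be split into its frontmatter metadata and typed payload blocks. The sample document, its field names (`id`, `title`), and the `parse_databook` helper are all hypothetical illustrations, not part of the article's proposal; the frontmatter is read as simple `key: value` lines rather than with a full YAML parser.

```python
# Build the fence marker programmatically so this sample can hold fenced blocks.
FENCE = "`" * 3

# Hypothetical DataBook: field names and layout are illustrative assumptions.
SAMPLE = "\n".join([
    "---",
    "id: ex:databook-1",
    "title: Example DataBook",
    "---",
    "Some prose describing the payload.",
    "",
    FENCE + "turtle",
    "ex:alice ex:knows ex:bob .",
    FENCE,
    "",
    FENCE + "json",
    '{"note": "second payload"}',
    FENCE,
])


def parse_databook(text):
    """Split a DataBook-style document into metadata and typed payload blocks."""
    lines = text.split("\n")
    meta = {}
    body_start = 0
    if lines and lines[0] == "---":
        # Frontmatter runs until the next '---' line.
        end = lines.index("---", 1)
        for line in lines[1:end]:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
        body_start = end + 1

    blocks = []
    current = None  # the block being collected, or None when outside a fence
    for line in lines[body_start:]:
        if line.startswith(FENCE):
            if current is None:
                # Opening fence: the text after the marker is the payload type.
                current = {"type": line[len(FENCE):].strip(), "lines": []}
            else:
                # Closing fence: store the finished block.
                blocks.append((current["type"], "\n".join(current["lines"])))
                current = None
        elif current is not None:
            current["lines"].append(line)
    return meta, blocks


meta, blocks = parse_databook(SAMPLE)
for block_type, payload in blocks:
    print(block_type, "->", payload)
```

Because each payload carries its own type label, a consumer of such a file could route a block to a matching parser without needing a triple store, which is the portability the summary attributes to the pattern.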
This article explores the critical intersection of knowledge graphs and data lineage in the context of modern AI and machine learning. It examines how combining these two technologies can provide the transparency and traceability required to build trustworthy AI systems. By mapping the origins, transformations, and movements of data, organizations can ensure better data quality, regulatory compliance, and improved model interpretability.
Dr. Ora Lassila is a Principal Graph Technologist at AWS, working within the Amazon Neptune team with a primary focus on knowledge graphs. Throughout his extensive career, he has held significant roles, including Managing Director at State Street and positions at Nokia Research Center and HERE. A recognized pioneer in his field, he co-authored the original W3C RDF specification and the seminal article on the Semantic Web. His professional expertise covers AI, ontologies, the Semantic Web, RDF, and Knowledge Representation. In addition to his technical contributions, he is an enthusiast of aviation photography and scale modeling, even applying knowledge graph technologies to manage his aviation photography business, So Many Aircraft.
This article explores how to represent sentences as graphs, moving beyond traditional semantic modeling to a more natural-language-oriented approach using reification and context graphs. It demonstrates how to translate sentences into RDF, Turtle, openCypher, and JSON-LD, highlighting the benefits of reification for capturing nuanced information and creating cleaner, more intuitive knowledge representations.
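As a rough illustration of reification, the sketch below encodes a hypothetical sentence, "Alice said that Bob likes jazz", as plain Python tuples. It assumes the classic RDF reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`), which may differ from the mechanism the article itself uses. The `ex:` identifiers and the `reify` helper are made up for this example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(statement_id, subject, predicate, obj):
    """Describe one triple as a resource, using the RDF reification vocabulary."""
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]


# Inner claim: "Bob likes jazz" (hypothetical ex: identifiers).
triples = reify("ex:stmt1", "ex:Bob", "ex:likes", "ex:Jazz")

# Outer claim: "Alice said <that statement>". The reified statement is now an
# ordinary node, so other triples can point at it.
triples.append(("ex:Alice", "ex:said", "ex:stmt1"))

for s, p, o in triples:
    print(s, p, o)
```

The point of the design is that the inner claim is described rather than asserted: the graph records that Alice said it without committing to whether Bob actually likes jazz.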
An introduction to semantic model-driven AI, exploring how SHACL (Shapes Constraint Language) can improve the reliability of LLM responses by providing structure and constraints to data.
An exploration of SHACL 1.2 UI and its potential for creating forms and views, drawing parallels to the earlier XForms technology. The article discusses the benefits of declarative UI generation, dynamic properties, and security features.
The article explores SHACL 1.2 UI as a powerful, declarative approach to building forms and views for RDF data, drawing parallels to the earlier (and ultimately unsuccessful) XForms standard. The author argues that SHACL 1.2 UI offers benefits like consistent data presentation, automated form generation, dynamic property computation, and enhanced security, potentially revolutionizing how we interact with data on the web. While current tooling is limited, existing DASH-compatible tools can be adapted, and the author envisions a future where data itself dictates its presentation, reducing the need for costly and inconsistent manual form creation.
The article explores how modern AI agents are fulfilling the vision of the Semantic Web by combining AI's learned intuition with the logical structure of semantic technologies, creating intelligent agents that can understand and act on behalf of users.
This paper proposes the Knowledge Graph of Thoughts (KGoT) architecture for AI assistants, integrating LLM reasoning with dynamically constructed knowledge graphs to reduce costs and improve performance on complex tasks like the GAIA benchmark.